6 research outputs found
Dual Language Models for Code Switched Speech Recognition
In this work, we present a simple and elegant approach to language modeling
for bilingual code-switched text. Since code-switching is a blend of two or
more different languages, a standard bilingual language model can be improved upon by exploiting the structure of the constituent monolingual language models. We propose a novel
technique called dual language models, which involves building two
complementary monolingual language models and combining them using a
probabilistic model for switching between the two. We evaluate the efficacy of
our approach on a conversational Mandarin-English speech corpus. We demonstrate the robustness of our model through significant perplexity improvements over the standard bilingual language model, without using any external information. Consistent improvements are also reflected in automatic speech recognition error rates.
Comment: Accepted at Interspeech 201
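The switching idea in this abstract can be sketched in a few lines: each word is scored by its own monolingual model, multiplied by a language-transition probability. This is a minimal sketch, not the paper's implementation; the unigram simplification, the `p_switch` value, and all names are our own assumptions (the paper uses stronger monolingual n-gram models and an estimated switching model).

```python
from collections import Counter

class UnigramLM:
    """Toy monolingual unigram LM with add-one smoothing, a stand-in
    for the monolingual n-gram models described in the abstract."""
    def __init__(self, corpus):
        self.counts = Counter(corpus)
        self.total = sum(self.counts.values())
        self.vocab = len(self.counts)

    def prob(self, word):
        return (self.counts[word] + 1) / (self.total + self.vocab + 1)

def dual_lm_prob(word, prev_word, lm_en, lm_zh, lang_of, p_switch=0.2):
    """P(word | prev_word) under a dual LM: a language-transition term
    times the word's probability under its own monolingual model."""
    cur = lang_of(word)
    mono = lm_en if cur == "en" else lm_zh
    base = mono.prob(word)
    if prev_word is None:
        return base  # sentence-initial: no transition term
    same_lang = lang_of(prev_word) == cur
    return ((1.0 - p_switch) if same_lang else p_switch) * base
```

Usage: train `UnigramLM` on each monolingual side of the code-switched corpus, pick a `lang_of` word classifier (e.g. script-based for Mandarin-English), and score a sentence as the product of `dual_lm_prob` over its words.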
Contextual Label Projection for Cross-Lingual Structure Extraction
Translating training data into target languages has proven beneficial for
cross-lingual transfer. However, for structure extraction tasks, translating data requires a label projection step that jointly translates the input text and locates the translated labels within it. Previous research in label projection mostly compromises translation quality, either by marking labels so they are easily identified in the translated text or by using word-level alignment between translation pairs to assemble translated phrase-level labels
from the aligned words. In this paper, we introduce CLAP, which first
translates text to the target language and performs contextual translation on
the labels using the translated text as the context, ensuring better accuracy
for the translated labels. We leverage instruction-tuned language models with
multilingual capabilities as our contextual translator, imposing the constraint
of the presence of translated labels in the translated text via instructions.
We compare CLAP with other label projection techniques for creating pseudo-training data in target languages on event argument extraction, a representative structure extraction task. Results show that CLAP improves over other methods by 2-2.5 F1 points on the Chinese and Arabic ACE05 datasets.
Comment: Work in Progress
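The contextual-translation step described above can be sketched as a prompt plus a post-check: ask the LM to translate each label phrase as a span of the already-translated sentence, then drop any output that violates the presence constraint. This is a hedged sketch under our own assumptions; the prompt wording, the `; `-separated answer format, and the function names are hypothetical, not CLAP's actual instructions.

```python
def build_clap_prompt(src_sentence, src_labels, tgt_sentence):
    """Build an instruction prompt (hypothetical wording) asking the LM to
    translate each labeled phrase using the translated sentence as context."""
    return "\n".join([
        "Translate each labeled phrase from the source sentence.",
        "Each translation MUST appear verbatim in the target sentence.",
        "Answer with the translations separated by '; '.",
        f"Source: {src_sentence}",
        f"Target: {tgt_sentence}",
        "Labeled phrases: " + "; ".join(src_labels),
    ])

def project_labels(llm, src_sentence, src_labels, tgt_sentence):
    """Query an instruction-tuned LM, then enforce the presence constraint
    as a post-check: any span not found in the target sentence becomes None."""
    raw = llm(build_clap_prompt(src_sentence, src_labels, tgt_sentence))
    projected = [span.strip() for span in raw.split(";")]
    return [span if span in tgt_sentence else None for span in projected]
```

Here `llm` is any callable mapping a prompt string to a response string, so a real multilingual instruction-tuned model can be dropped in unchanged.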
GENEVA: Pushing the Limit of Generalizability for Event Argument Extraction with 100+ Event Types
Numerous events occur worldwide and are documented in the news, social media,
and various online platforms in raw text. Extracting useful and succinct
information about these events is crucial to various downstream applications.
Event Argument Extraction (EAE) deals with the task of extracting
event-specific information from natural language text. In order to cater to new
events and domains in a realistic low-data setting, there is a growing urgency
for EAE models to be generalizable. Consequently, benchmarking setups are needed to evaluate the generalizability of EAE models. However, most
existing benchmarking datasets like ACE and ERE have limited coverage in terms
of events and cannot adequately evaluate the generalizability of EAE models. To
alleviate this issue, we introduce a new dataset GENEVA covering a diverse
range of 115 events and 187 argument roles. Using this dataset, we create four
benchmarking test suites to assess a model's generalization capability from different perspectives. We benchmark various representative models on these test suites and compare their relative generalizability. Finally, we propose a new model, SCAD, which outperforms the previous models and serves as a strong baseline for these test suites.
Comment: 13 pages, 10 figures